240 research outputs found

    Using topic information to improve non-exact keyword-based search for mobile applications

    Given the wide range of mobile applications available today, effective search engines are imperative for a user to find applications that provide a specific desired functionality. Retrieval approaches that leverage topic similarity between queries and applications have shown promising results in previous studies. However, the search engines used by most app stores are based on keyword matching and boosting. In this paper, we explore ways to include topic information in such approaches, in order to improve their ability to retrieve relevant applications for non-exact queries without impairing their computational performance. More specifically, we create topic models specialized for application descriptions and explore how the most relevant terms for each topic covered by an application can be used to complement the information provided by its description. Our experiments show that, although these topic keywords cannot provide all the information of the topic model, they provide a sufficiently informative summary of the topics covered by the descriptions, leading to improved performance.
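As a rough illustration of the idea in the abstract above (complementing a description with the most relevant terms of the topics it covers), here is a minimal sketch. The function name, the `min_weight` and `top_k` parameters, and the toy topic data are all assumptions made for illustration; the abstract does not specify how the keywords are selected or attached.

```python
def augment_description(description, app_topics, topic_terms, min_weight=0.2, top_k=3):
    """Append the top terms of each sufficiently covered topic to a description.

    app_topics:  hypothetical mapping topic -> weight of that topic for the app
    topic_terms: hypothetical mapping topic -> terms ordered by relevance
    """
    extra = []
    # Visit topics from the most to the least covered one.
    for topic, weight in sorted(app_topics.items(), key=lambda kv: -kv[1]):
        if weight < min_weight:
            continue  # topic is only marginally covered, skip it
        for term in topic_terms[topic][:top_k]:
            if term not in extra:
                extra.append(term)
    if not extra:
        return description
    return description + " " + " ".join(extra)


# Toy data, invented for this example only.
topic_terms = {
    "fitness": ["workout", "exercise", "calories", "gym"],
    "music": ["playlist", "songs", "radio"],
}
augmented = augment_description(
    "Track your runs", {"fitness": 0.7, "music": 0.1}, topic_terms
)
```

The augmented text can then be indexed by an ordinary keyword-matching engine, which is the property the abstract emphasizes: topic information is added without changing the retrieval mechanism.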

    Hybrid MM/SVM structural sensors for stochastic sequential data

    In this paper we present preliminary results from a novel application of Markov Models and Support Vector Machines to the classification of Intron-Exon and Exon-Intron (5' and 3') splice sites. We use Markov-based statistical methods, in a log-likelihood discriminator framework, to create a non-summed, fixed-length feature vector for SVM-based classification. We also explore the use of Shannon-entropy-based analysis for automated identification of minimal-size models (where smaller models have known information loss according to the specified Shannon entropy representation). We evaluate a variety of kernels and kernel parameters in the classification effort. For comparison, we present results of the algorithms on splice-site datasets consisting of sequences from a variety of species.
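One way to read "non-summed, fixed-length feature vector" is that the per-position log-likelihood ratio terms are kept as separate features instead of being added into a single score. The sketch below illustrates that reading under several assumptions that are not in the abstract: a homogeneous first-order Markov chain, add-one smoothing, a DNA alphabet, and the function names `train_markov` and `llr_features`.

```python
import math

ALPHABET = "ACGT"


def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities with additive smoothing."""
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        for prev, cur in zip(s, s[1:]):
            counts[prev][cur] += 1.0
    probs = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in ALPHABET}
    return probs


def llr_features(seq, pos_model, neg_model):
    """One log-likelihood ratio per transition; the terms are NOT summed."""
    return [
        math.log(pos_model[p][c]) - math.log(neg_model[p][c])
        for p, c in zip(seq, seq[1:])
    ]


# Toy training sequences, invented for this example only.
pos_model = train_markov(["GTAAGT", "GTGAGT"])
neg_model = train_markov(["ACACAC", "CACACA"])
features = llr_features("GTAAGT", pos_model, neg_model)
```

For sequences of fixed length L this yields L - 1 features, which could be passed to any SVM implementation. Summing the vector recovers the usual scalar log-likelihood ratio discriminator.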

    On Multiview Analysis for Fingerprint Liveness Detection

    Fingerprint recognition systems, like any other biometric system, can be subject to attacks, which are usually carried out using artificial fingerprints. Several approaches to discriminate between live and fake fingerprint images have been presented to address this issue. These methods usually rely on the analysis of individual features extracted from the fingerprint images. Such features represent different and complementary views of the object under analysis, and their fusion is likely to improve classification accuracy. However, very little work in this direction has been reported in the literature. In this work, we present the results of a preliminary investigation of multiview analysis for fingerprint liveness detection. Experimental results show the effectiveness of this approach, which improves on previous results in the literature.

    SVM clustering

    Top quark tensor couplings

    We compute the real and imaginary parts of the one-loop electroweak contributions to the left and right tensorial anomalous couplings of the $tbW$ vertex in the Standard Model (SM). For both tensorial couplings we find that the real part of the electroweak SM correction is close to 10% of the leading contribution given by the QCD gluon exchange. We also find that the electroweak real and imaginary parts for the anomalous right coupling are almost of the same order of magnitude. The one-loop SM prediction for the real part of the left coupling is close to the $3\sigma$ discovery limit derived from $b \rightarrow s\gamma$. Moreover, since the predictions of new physics interactions are also at the level of a few percent when compared with the one-loop QCD gluon exchange, these electroweak corrections should be taken into account in order to disentangle new physics effects from the standard ones. These anomalous tensorial couplings of the top quark will be investigated at the LHC in the near future, where sensitivity to these contributions may be achieved. Comment: 16 pages, 2 figures

    Analysis of nanopore detector measurements using Machine-Learning methods, with application to single-molecule kinetic analysis

    Background: A nanopore detector has a nanometer-scale trans-membrane channel across which a potential difference is established, resulting in an ionic current through the channel in the pA-nA range. A distinctive channel current blockade signal is created as individually "captured" DNA molecules interact with the channel and modulate the channel's ionic current. The nanopore detector is sensitive enough that nearly identical DNA molecules can be classified with very high accuracy using machine learning techniques such as Hidden Markov Models (HMMs) and Support Vector Machines (SVMs). Results: A non-standard implementation of an HMM, emission inversion, is used for improved classification. Additional features are also considered for the feature vector employed by the SVM for classification: the addition of a single feature representing spike density is shown to notably improve classification results. Another, much larger, feature set expansion was studied (2500 additional features instead of 1), derived from including all the HMM's transition probabilities. The expanded features can introduce redundant, noisy information (as well as diagnostic information) into the current feature set, and thus degrade classification performance. A hybrid Adaptive Boosting approach was used for feature selection to alleviate this problem. Conclusion: The methods shown here, for more informed feature extraction, both improve classification and provide biologists and chemists with tools for obtaining a better understanding of the kinetic properties of molecules of interest.

    Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

    We contribute a novel ball-histogram approach to predicting the DNA-binding propensity of proteins. Unlike state-of-the-art methods based on constructing an ad hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA-binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.

    M3G: Maximum Margin Microarray Gridding

    Background: Complementary DNA (cDNA) microarrays are a well-established technology for studying gene expression. A microarray image is obtained by laser scanning a hybridized cDNA microarray, which consists of thousands of spots representing chains of cDNA sequences, arranged in a two-dimensional array. The separation of the spots into distinct cells is widely known as microarray image gridding. Methods: In this paper we propose M3G, a novel method for automatic gridding of cDNA microarray images based on the maximization of the margin between the rows and the columns of the spots. Initially the microarray image rotation is estimated, and then a pre-processing algorithm is applied for rough spot detection. In order to diminish the effect of artefacts, only a subset of the detected spots is selected by matching the distribution of the spot sizes to the normal distribution. Then, a set of grid lines is placed on the image in order to separate each pair of consecutive rows and columns of the selected spots. The optimal positioning of the lines is determined by maximizing the margin between these rows and columns using a maximum margin linear classifier, effectively facilitating the localization of the spots. Results: The experimental evaluation was based on a reference set of microarray images containing more than two million spots in total. The results show that M3G outperforms state-of-the-art methods, demonstrating robustness in the presence of noise and artefacts. More than 98% of the spots reside completely inside their respective grid cells, whereas the mean distance between the spot center and the grid cell center is 1.2 pixels. Conclusions: The proposed method performs highly accurate gridding in the presence of noise and artefacts, while taking into account the input image rotation. Thus, it provides the potential of achieving perfect gridding for the vast majority of the spots.
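To make the maximum-margin idea concrete, consider a deliberately simplified case that is not the method described in the abstract: the image is already de-rotated, the grid line is horizontal, and the two rows are perfectly separable. In one dimension the hard-margin separator between two separable groups is the midpoint between their closest points. The function name and the input coordinates below are hypothetical.

```python
def max_margin_line(upper_row_bottoms, lower_row_tops):
    """Place a horizontal grid line between two consecutive rows of spots.

    upper_row_bottoms: y-coordinates of the bottom edges of spots in the upper row
    lower_row_tops:    y-coordinates of the top edges of spots in the lower row
    Image y is assumed to grow downward, so the upper row has smaller y values.
    Returns (line position, margin on each side).
    """
    closest_upper = max(upper_row_bottoms)
    closest_lower = min(lower_row_tops)
    if closest_upper >= closest_lower:
        raise ValueError("rows overlap; a hard-margin separator does not exist")
    position = (closest_upper + closest_lower) / 2.0
    margin = (closest_lower - closest_upper) / 2.0
    return position, margin


# Toy coordinates in pixels, invented for this example only.
position, margin = max_margin_line([10.0, 11.5, 10.8], [18.0, 17.5, 19.0])
```

Only the two closest spot edges determine the line, which mirrors the role of support vectors in a maximum margin linear classifier. Handling residual rotation, noise, and overlapping spots would require a full two-dimensional, soft-margin formulation.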

    Texture analysis- and support vector machine-assisted diffusional kurtosis imaging may allow in vivo gliomas grading and IDH-mutation status prediction: a preliminary study

    We sought to investigate whether texture analysis of diffusional kurtosis imaging (DKI) enhanced by support vector machine (SVM) analysis may provide biomarkers for gliomas staging and detection of the IDH mutation. First-order statistics and texture feature extraction were performed in 37 patients on both conventional (FLAIR) and mean diffusional kurtosis (MDK) images, and recursive feature elimination (RFE) methodology based on SVM was employed to select the most discriminative diagnostic biomarkers. The first-order statistics demonstrated significantly lower MDK values in the IDH-mutant tumors. This resulted in 81.1% accuracy (sensitivity 0.96, specificity 0.45, AUC 0.59) for IDH mutation diagnosis. Differences in average MDK and skewness among the different tumour grades were not significant. When texture analysis and SVM were utilized, the grading accuracy achieved by DKI biomarkers was 78.1% (sensitivity 0.77, specificity 0.79, AUC 0.79); the prediction accuracy for IDH mutation reached 83.8% (sensitivity 0.96, specificity 0.55, AUC 0.87). For the IDH mutation task, DKI significantly outperformed FLAIR imaging. When using selected biomarkers after RFE, the prediction accuracy reached 83.8% (sensitivity 0.92, specificity 0.64, AUC 0.88). These findings demonstrate the superiority of DKI enhanced by texture analysis and SVM, compared to conventional imaging, for gliomas staging and prediction of IDH mutational status.
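The accuracy, sensitivity, and specificity figures quoted above are related through the confusion matrix, and a short sketch shows how. The counts used here are hypothetical: the abstract reports neither a confusion matrix nor the class split. They were chosen only because, with 37 patients in total, they reproduce the rounded values quoted for the texture-analysis SVM on the IDH task (83.8%, 0.96, 0.55).

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts (not from the paper): 26 positives, 11 negatives.
metrics = binary_metrics(tp=25, fn=1, tn=6, fp=5)
```

With an imbalanced split such as this assumed one, a high accuracy can coexist with a low specificity, which is why the abstract reports all three figures together with the AUC.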

    ANGLOR: A Composite Machine-Learning Algorithm for Protein Backbone Torsion Angle Prediction

    We developed a composite machine-learning-based algorithm, called ANGLOR, to predict real-value protein backbone torsion angles from amino acid sequences. The input features of ANGLOR include sequence profiles, predicted secondary structure and solvent accessibility. In a large-scale benchmarking test, the mean absolute error (MAE) of the phi/psi prediction is 28°/46°, which is ∼10% lower than that generated by software in the literature. The prediction is statistically different from a random predictor (or a purely secondary-structure-based predictor) with p-value <1.0×10^−300 (or <1.0×10^−148) by the Wilcoxon signed rank test. For some residues (ILE, LEU, PRO and VAL), and especially for residues in helix and buried regions, the MAE of phi angles is much smaller (10–20°) than that in other environments. Thus, although the average accuracy of the ANGLOR prediction is still low, the portion of accurately predicted dihedral angles may be useful in assisting protein fold recognition and ab initio 3D structure modeling.
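Because torsion angles are periodic, a mean absolute error over them is usually computed on the shorter arc between prediction and truth, so that 170° and -170° count as 20° apart and not 340°. The abstract does not state how ANGLOR handles periodicity, so the sketch below is an assumption about a reasonable convention, with a hypothetical function name and invented sample angles.

```python
def angular_mae(predicted, actual):
    """Mean absolute error in degrees, measured along the shorter arc."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, a in zip(predicted, actual):
        diff = abs(p - a) % 360.0
        total += min(diff, 360.0 - diff)  # wrap around the circle
    return total / len(predicted)


# Toy angles in degrees, invented for this example only.
mae = angular_mae([170.0, -60.0], [-170.0, -90.0])
```

Without the wrap-around step the first pair would contribute 340° instead of 20°, which would inflate the error for residues whose angles lie near the ±180° boundary.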